Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VarInt api rewrite #237

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

VarInt api rewrite #237

wants to merge 5 commits into from

Conversation

Vrtgs
Copy link

@Vrtgs Vrtgs commented Nov 6, 2024

Description

reduce code duplication, and have better code (no allocation during serialization (a very common operation), single calls to write instead of multiple) and pass VarInt by value, after all they are just an i32

Testing

I joined the minecraft server

and added a test that is ignored by default (i ran it tho) that double checks that I do serialize and deserialize the VarInts and VarLongs properly

Checklist

Things need to be done before this Pull Request can be merged.

  • Code is well-formatted and adheres to project style guidelines: cargo fmt
  • Code does not produce any clippy warnings cargo clippy

reduce code duplication, and have better code (no allocation during serialization, single calls to write instead of multiple) and pass VarInt by value, after all they are just an i32
Copy link
Contributor

@StripedMonkey StripedMonkey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I don't think this is a bad PR, and applaud your efforts! The VarInt API is definitely not in the best place right now, and I am for any improvements we can make to the API.

Alex or others may take issue with the overuse of macros (imo) for relatively marginal gains, but I'm alright with it for now.

I believe (and have been working on in #105 ) that VarInt shouldn't be a data type present at all, but a serialization format, essentially

#[serde(with = "var_int")]
len: i32

On any struct which requires an i32 to be formatted as a varint for the transport protocol. I'm quite busy IRL, and it's a long process. If you feel like taking a stab you're welcome to.

This is a good first step, I approve of all the changes except for the branding one.

Comment on lines +6 to +8
[dev-dependencies]
rayon = "1"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this getting added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

varint and varlong tests, I tested the full range of numbers to be sure and without rayon it is very slow to say the least

pumpkin-protocol/src/var_int_helper.rs Show resolved Hide resolved
Comment on lines +196 to +202
impl std::ops::Add<$ty> for $var_int {
type Output = $var_int;

fn add(self, rhs: $ty) -> Self::Output {
Self(self.0 + rhs)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of adding some reduced subset of std::ops for no reason.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<VarInt> + 1 is pretty common, I could add the full set if that's better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VarInts are nothing but integers. The fact that +1, -1 and all the other operations aren't implemented is a nussance to anyone trying to use it. If you want to implement AsMut or AsRef you can get some of that for free.

Comment on lines -47 to +63
/// Cached Server brand buffer so we don't have to rebuild them every time a player joins
cached_server_brand: Vec<u8>,
}
pub struct CachedBranding(());

impl CachedBranding {
pub fn new() -> Self {
let cached_server_brand = Self::build_brand();
Self {
cached_server_brand,
}
Self(())
}

#[allow(clippy::unused_self)]
pub fn get_branding(&self) -> CPluginMessage {
CPluginMessage::new("minecraft:brand", &self.cached_server_brand)
}
fn build_brand() -> Vec<u8> {
let brand = "Pumpkin";
let mut buf = vec![];
let _ = VarInt(brand.len() as i32).encode(&mut buf);
buf.extend_from_slice(brand.as_bytes());
buf
static BRANDING: LazyLock<Box<[u8]>> = LazyLock::new(|| {
let brand = "Pumpkin";
let mut buf = vec![];
let _ = VarInt::try_from(brand.len()).unwrap().write_to(&mut buf);
buf.extend_from_slice(brand.as_bytes());
buf.into_boxed_slice()
});

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not ok with this change, and I think clippy pointed you towards part of why I'm not ok with the change.

For one thing, this forces all CachedBranding to be the same, and only initialized once. All branding is contained within one singleton, global brand which isn't obvious from the CachedBranding's API. If I were to see a CachedBranding being used elsewhere, it just wouldn't be obvious that I can construct one literally anywhere I'd like and it's the same everywhere.

For another reason, you've limited the scope of BRANDING to get_branding(), this means that it's completely impossible to modify the branding at runtime, despite it being lazily evaluated. There is nothing that can modify or update the branding later, and that I think is pretty pointless. If we're going to force it to be unmodifiable at runtime, we might as well make the whole expression const and compile time no?

If this static, singleton version of branding were to be done (which I'm not principally against) I would rather there be a static BRANDING: LazyLock<CachedBranding> =... lazy static rather than it be an internal implementation detail.

I'm not sure why clippy didn't warn you, but if you have an empty struct, just use struct CachedBranding;

tl;dr while the actual implementation of branding isn't 100% right now, this is solidly a step backwards towards either making branding compile time or towards making it modifiable at runtime.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to force it to be unmodifiable at runtime, we might as well make the whole expression const and compile time no?

we could, I just used the path of lowest resistance

I am not ok with this change, and I think clippy pointed you towards part of why I'm not ok with the change.
oh yeah for sure, I was going to talk about how a lot of api's seem hacked together

my idea isn't to make lots of drastic changes, thats why I didn't modify the api, and I think this should have been just a Branding struct, not a CachedBranding that is locked to just the value of pumpkin

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you'd like to update this API that would be a separate PR IMHO. I don't like the interior statics and how the branding is now completely locked to a single function on a tuple struct. I would prefer it to not be static at all, but if it is static, this is not exactly flexible.

@suprohub
Copy link
Contributor

suprohub commented Nov 7, 2024

I want simd in varint

@StripedMonkey
Copy link
Contributor

I want simd in varint

I doubt varints are a good candidate for simd on their own considering the relatively short runs they make. It's unlikely to see much gains. You are welcome to prototype that however.

@suprohub
Copy link
Contributor

suprohub commented Nov 7, 2024

I want simd in varint

I doubt varints are a good candidate for simd on their own considering the relatively short runs they make. It's unlikely to see much gains. You are welcome to prototype that however.

https://lib.rs/crates/varint-simd

@suprohub
Copy link
Contributor

suprohub commented Nov 7, 2024

Screenshot_20241107-090941_Iceraven.png

Fastest encoer and decoder

@StripedMonkey
Copy link
Contributor

And if we were doing nothing but crunching millions of varints per second I would agree with you. Just like how I'd agree that SIMD would be faster if we were doing nothing but adding millions of numbers together in sequence.

SIMD requires significant "wind up time" to achieve those kinds of numbers. Because VarInts are actually somewhat sparsely laid out in the data parsing, it's unlikely we'll see major improvements simply because of that.

@StripedMonkey
Copy link
Contributor

Regardless, this is a conversation for elsewhere. Please open a new issue or discussion thread.

@Snowiiii
Copy link
Owner

Snowiiii commented Nov 7, 2024

I already see a lot of new unwraps..., I will take a look at this when i have time. But we generally should avoid unwraps

@Vrtgs
Copy link
Author

Vrtgs commented Nov 7, 2024

I already see a lot of new unwraps..., I will take a look at this when i have time. But we generally should avoid unwraps

Yeah I agree, but since the scope of this PR is supposed to be small I didn't want to rewrite sections, and if the unwrap are triggered that is a bug and it should be fixed, it's just that casting i32 to usize is a faliable operation

@Vrtgs
Copy link
Author

Vrtgs commented Nov 7, 2024

I want simd in varint

As it stands, I was able to squeeze most of encoding and decoding of varints, because the old implementations needlessly allocated and had multiple write calls, but this isn't really the point of the pr, I agree though it's very cool to have, the api is written in such a way now tho that we can swap the internal implementation any time and all varint encoding would use that new implementation

@Vrtgs
Copy link
Author

Vrtgs commented Nov 7, 2024

Overall, I don't think this is a bad PR, and applaud your efforts! The VarInt API is definitely not in the best place right now, and I am for any improvements we can make to the API.

Alex or others may take issue with the overuse of macros (imo) for relatively marginal gains, but I'm alright with it for now.

I believe (and have been working on in #105 ) that VarInt shouldn't be a data type present at all, but a serialization format, essentially

#[serde(with = "var_int")]
len: i32

On any struct which requires an i32 to be formatted as a varint for the transport protocol. I'm quite busy IRL, and it's a long process. If you feel like taking a stab you're welcome to.

This is a good first step, I approve of all the changes except for the branding one.

And honestly I agree, varint is an encoding standard, and I thought about it too, that rather than a datatype that varint should just be a module with all of the methods in varint as just simple functions, but I didn't want to rewrite serialization as well, do I opted for the less invasive option

@@ -49,7 +49,7 @@ impl PacketEncoder {

self.compress_buf.clear();

let data_len_size = VarInt(data_len as i32).written_size();
let data_len_size = VarInt::try_from(data_len).unwrap().written_size();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You added a bunch of unwraps in packet_encoder. This looks pretty dangerous. The first rule in Networking is to never trust the Client.

.map(|len| match len {
1 => Some(byte),
0 => None,
_ => unreachable!(),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is unreachable?

}
}

macro_rules! impl_var_int {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why you decided to use macros

@Snowiiii
Copy link
Owner

Snowiiii commented Nov 7, 2024

I really dislike having .get() now everywhere. @StripedMonkey solution of using

#[serde(with = "var_int")]
my_var_int: i32

Sounds really nice, This lets us directly use the field and we also don't have to implement any std::ops:: Things

@StripedMonkey
Copy link
Contributor

StripedMonkey commented Nov 7, 2024

Like was mentioned, it requires a lot of rework to make the packets use Serde over the funky encode/decode stuff that already exists (which is equivalent to a fn (de)serialize(...))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants